Skip to content

[flang][OpenMP] Basic mapping of do concurrent ... reduce to OpenMP #146033

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 11, 2025

Conversation

ergawy
Copy link
Member

@ergawy ergawy commented Jun 27, 2025

@llvmbot llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir labels Jun 27, 2025
@llvmbot
Copy link
Member

llvmbot commented Jun 27, 2025

@llvm/pr-subscribers-flang-fir-hlfir

Author: Kareem Ergawy (ergawy)

Changes

Now that we have changes introduced by #145837, mapping reductions from do concurrent to OpenMP is almost trivial. This PR adds such mapping.

TODO: Add tests


Full diff: https://github.com/llvm/llvm-project/pull/146033.diff

1 Files Affected:

  • (modified) flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp (+56-27)
diff --git a/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp b/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp
index 709cf1d0938fa..31076f6eb328f 100644
--- a/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp
+++ b/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp
@@ -312,6 +312,19 @@ class DoConcurrentConversion
               bool isComposite) const {
     mlir::omp::WsloopOperands wsloopClauseOps;
 
+    auto cloneFIRRegionToOMP = [&rewriter](mlir::Region &firRegion,
+                                           mlir::Region &ompRegion) {
+      if (!firRegion.empty()) {
+        rewriter.cloneRegionBefore(firRegion, ompRegion, ompRegion.begin());
+        auto firYield =
+            mlir::cast<fir::YieldOp>(ompRegion.back().getTerminator());
+        rewriter.setInsertionPoint(firYield);
+        rewriter.create<mlir::omp::YieldOp>(firYield.getLoc(),
+                                            firYield.getOperands());
+        rewriter.eraseOp(firYield);
+      }
+    };
+
     // For `local` (and `local_init`) opernads, emit corresponding `private`
     // clauses and attach these clauses to the workshare loop.
     if (!loop.getLocalVars().empty())
@@ -326,50 +339,65 @@ class DoConcurrentConversion
           TODO(localizer.getLoc(),
                "local_init conversion is not supported yet");
 
-        auto oldIP = rewriter.saveInsertionPoint();
+        mlir::OpBuilder::InsertionGuard guard(rewriter);
         rewriter.setInsertionPointAfter(localizer);
+
         auto privatizer = rewriter.create<mlir::omp::PrivateClauseOp>(
             localizer.getLoc(), sym.getLeafReference().str() + ".omp",
             localizer.getTypeAttr().getValue(),
             mlir::omp::DataSharingClauseType::Private);
 
-        if (!localizer.getInitRegion().empty()) {
-          rewriter.cloneRegionBefore(localizer.getInitRegion(),
-                                     privatizer.getInitRegion(),
-                                     privatizer.getInitRegion().begin());
-          auto firYield = mlir::cast<fir::YieldOp>(
-              privatizer.getInitRegion().back().getTerminator());
-          rewriter.setInsertionPoint(firYield);
-          rewriter.create<mlir::omp::YieldOp>(firYield.getLoc(),
-                                              firYield.getOperands());
-          rewriter.eraseOp(firYield);
-        }
-
-        if (!localizer.getDeallocRegion().empty()) {
-          rewriter.cloneRegionBefore(localizer.getDeallocRegion(),
-                                     privatizer.getDeallocRegion(),
-                                     privatizer.getDeallocRegion().begin());
-          auto firYield = mlir::cast<fir::YieldOp>(
-              privatizer.getDeallocRegion().back().getTerminator());
-          rewriter.setInsertionPoint(firYield);
-          rewriter.create<mlir::omp::YieldOp>(firYield.getLoc(),
-                                              firYield.getOperands());
-          rewriter.eraseOp(firYield);
-        }
-
-        rewriter.restoreInsertionPoint(oldIP);
+        cloneFIRRegionToOMP(localizer.getInitRegion(),
+                            privatizer.getInitRegion());
+        cloneFIRRegionToOMP(localizer.getDeallocRegion(),
+                            privatizer.getDeallocRegion());
 
         wsloopClauseOps.privateVars.push_back(op);
         wsloopClauseOps.privateSyms.push_back(
             mlir::SymbolRefAttr::get(privatizer));
       }
 
+    if (!loop.getReduceVars().empty()) {
+      for (auto [op, byRef, sym, arg] : llvm::zip_equal(
+               loop.getReduceVars(), loop.getReduceByrefAttr().asArrayRef(),
+               loop.getReduceSymsAttr().getAsRange<mlir::SymbolRefAttr>(),
+               loop.getRegionReduceArgs())) {
+        auto firReducer =
+            mlir::SymbolTable::lookupNearestSymbolFrom<fir::DeclareReductionOp>(
+                loop, sym);
+
+        mlir::OpBuilder::InsertionGuard guard(rewriter);
+        rewriter.setInsertionPointAfter(firReducer);
+
+        auto ompReducer = rewriter.create<mlir::omp::DeclareReductionOp>(
+            firReducer.getLoc(), sym.getLeafReference().str() + ".omp",
+            firReducer.getTypeAttr().getValue());
+
+        cloneFIRRegionToOMP(firReducer.getAllocRegion(),
+                            ompReducer.getAllocRegion());
+        cloneFIRRegionToOMP(firReducer.getInitializerRegion(),
+                            ompReducer.getInitializerRegion());
+        cloneFIRRegionToOMP(firReducer.getReductionRegion(),
+                            ompReducer.getReductionRegion());
+        cloneFIRRegionToOMP(firReducer.getAtomicReductionRegion(),
+                            ompReducer.getAtomicReductionRegion());
+        cloneFIRRegionToOMP(firReducer.getCleanupRegion(),
+                            ompReducer.getCleanupRegion());
+
+        wsloopClauseOps.reductionVars.push_back(op);
+        wsloopClauseOps.reductionByref.push_back(byRef);
+        wsloopClauseOps.reductionSyms.push_back(
+            mlir::SymbolRefAttr::get(ompReducer));
+      }
+    }
+
     auto wsloopOp =
         rewriter.create<mlir::omp::WsloopOp>(loop.getLoc(), wsloopClauseOps);
     wsloopOp.setComposite(isComposite);
 
     Fortran::common::openmp::EntryBlockArgs wsloopArgs;
     wsloopArgs.priv.vars = wsloopClauseOps.privateVars;
+    wsloopArgs.reduction.vars = wsloopClauseOps.reductionVars;
     Fortran::common::openmp::genEntryBlock(rewriter, wsloopArgs,
                                            wsloopOp.getRegion());
 
@@ -393,7 +421,8 @@ class DoConcurrentConversion
                              clauseOps.loopLowerBounds.size())))
       rewriter.replaceAllUsesWith(loopNestArg, wsloopArg);
 
-    for (unsigned i = 0; i < loop.getLocalVars().size(); ++i)
+    for (unsigned i = 0;
+         i < loop.getLocalVars().size() + loop.getReduceVars().size(); ++i)
       loopNestOp.getRegion().eraseArgument(clauseOps.loopLowerBounds.size());
 
     return loopNestOp;

Copy link
Contributor

@tblah tblah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but please could you add a test that covers every region.

@ergawy ergawy force-pushed the users/ergawy/convert_locality_specs_to_clauses_7 branch from 5f665c9 to 105ac7a Compare June 30, 2025 04:19
@ergawy ergawy force-pushed the users/ergawy/convert_locality_specs_to_clauses_8 branch from c99a502 to 34a5df1 Compare June 30, 2025 04:23
@ergawy ergawy force-pushed the users/ergawy/convert_locality_specs_to_clauses_7 branch from 105ac7a to 3e2bdf5 Compare June 30, 2025 04:40
@ergawy ergawy force-pushed the users/ergawy/convert_locality_specs_to_clauses_8 branch from 34a5df1 to 185b43f Compare June 30, 2025 05:28
@ergawy
Copy link
Member Author

ergawy commented Jun 30, 2025

LGTM but please could you add a test that covers every region.

Thanks for the review. Done.

Copy link
Contributor

@mjklemm mjklemm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ergawy ergawy force-pushed the users/ergawy/convert_locality_specs_to_clauses_7 branch from 3e2bdf5 to 15ed831 Compare July 2, 2025 15:42
@ergawy ergawy force-pushed the users/ergawy/convert_locality_specs_to_clauses_8 branch from 185b43f to 8c25fe7 Compare July 2, 2025 15:47
Copy link
Contributor

@clementval clementval left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ergawy ergawy force-pushed the users/ergawy/convert_locality_specs_to_clauses_7 branch from 15ed831 to 53f9c0d Compare July 9, 2025 07:18
@ergawy ergawy force-pushed the users/ergawy/convert_locality_specs_to_clauses_8 branch from 8c25fe7 to 39e6ed7 Compare July 9, 2025 07:18
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Jul 10, 2025
…lled in OpenMP and OpenACC

This PR proposes re-modelling `reduce` specifiers to match OpenMP and OpenACC. In particular, this PR includes the following:

* A new `fir` op: `fir.delcare_reduction` which is identical to OpenMP's `omp.declare_reduction` op.
* Updating the `reduce` clause on `fir.do_concurrent.loop` to use the new op.
* Re-uses the `ReductionProcessor` component to emit reductions for `do concurrent` just like we do for OpenMP. To do this, the `ReductionProcessor` had to be refactored to be more generalized.
* Upates mapping `do concurrent` to `fir.loop ... unordered` nests using the new reduction model.

Unfortunately, this is a big PR that would be difficult to divide up in smaller parts because the bottom of the changes are the `fir` table-gen changes to `do concurrent`. However, doing these MLIR changes cascades to the other parts that have to be modified to not break things.

This PR goes in the same direction we went for `private/local` speicifiers. Now the `do concurrent` and OpenMP (and OpenACC) dialects are modelled in essentially the same way which makes mapping between them more trivial, hopefully.

PR stack:
- llvm#145837 (this one)
- llvm#146025
- llvm#146028
- llvm#146033
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Jul 10, 2025
With llvm#145837, the `ReductionProcessor` component is now used by both OpenMP and `do concurrent`. Therefore, this PR moves it to a shared location: `flang/Lower/Support`.

PR stack:
- llvm#145837
- llvm#146025 (this one)
- llvm#146028
- llvm#146033
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Jul 10, 2025
Re-organizes the op definition a little bit and removes a method that does not add much value to the API.

PR stack:
- llvm#145837
- llvm#146025
- llvm#146028 (this one)
- llvm#146033
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Jul 10, 2025
Now that we have changes introduced by llvm#145837, mapping reductions from `do concurrent` to OpenMP is almost trivial. This PR adds such mapping.

PR stack:
- llvm#145837
- llvm#146025
- llvm#146028
- llvm#146033 (this one)
ergawy added a commit that referenced this pull request Jul 11, 2025
…lled in OpenMP and OpenACC (#145837)

This PR proposes re-modelling `reduce` specifiers to match OpenMP and
OpenACC. In particular, this PR includes the following:

* A new `fir` op: `fir.delcare_reduction` which is identical to OpenMP's
`omp.declare_reduction` op.
* Updating the `reduce` clause on `fir.do_concurrent.loop` to use the
new op.
* Re-uses the `ReductionProcessor` component to emit reductions for `do
concurrent` just like we do for OpenMP. To do this, the
`ReductionProcessor` had to be refactored to be more generalized.
* Upates mapping `do concurrent` to `fir.loop ... unordered` nests using
the new reduction model.

Unfortunately, this is a big PR that would be difficult to divide up in
smaller parts because the bottom of the changes are the `fir` table-gen
changes to `do concurrent`. However, doing these MLIR changes cascades
to the other parts that have to be modified to not break things.

This PR goes in the same direction we went for `private/local`
speicifiers. Now the `do concurrent` and OpenMP (and OpenACC) dialects
are modelled in essentially the same way which makes mapping between
them more trivial, hopefully.

PR stack:
- #145837 (this one)
- #146025
- #146028
- #146033
@ergawy ergawy force-pushed the users/ergawy/convert_locality_specs_to_clauses_7 branch from 53f9c0d to 8e67de7 Compare July 11, 2025 04:41
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jul 11, 2025
…ns are modelled in OpenMP and OpenACC (#145837)

This PR proposes re-modelling `reduce` specifiers to match OpenMP and
OpenACC. In particular, this PR includes the following:

* A new `fir` op: `fir.delcare_reduction` which is identical to OpenMP's
`omp.declare_reduction` op.
* Updating the `reduce` clause on `fir.do_concurrent.loop` to use the
new op.
* Re-uses the `ReductionProcessor` component to emit reductions for `do
concurrent` just like we do for OpenMP. To do this, the
`ReductionProcessor` had to be refactored to be more generalized.
* Upates mapping `do concurrent` to `fir.loop ... unordered` nests using
the new reduction model.

Unfortunately, this is a big PR that would be difficult to divide up in
smaller parts because the bottom of the changes are the `fir` table-gen
changes to `do concurrent`. However, doing these MLIR changes cascades
to the other parts that have to be modified to not break things.

This PR goes in the same direction we went for `private/local`
speicifiers. Now the `do concurrent` and OpenMP (and OpenACC) dialects
are modelled in essentially the same way which makes mapping between
them more trivial, hopefully.

PR stack:
- llvm/llvm-project#145837 (this one)
- llvm/llvm-project#146025
- llvm/llvm-project#146028
- llvm/llvm-project#146033
ergawy added a commit that referenced this pull request Jul 11, 2025
With #145837, the `ReductionProcessor` component is now used by both
OpenMP and `do concurrent`. Therefore, this PR moves it to a shared
location: `flang/Lower/Support`.

PR stack:
- #145837
- #146025 (this one)
- #146028
- #146033
@ergawy ergawy force-pushed the users/ergawy/convert_locality_specs_to_clauses_7 branch from 8e67de7 to df63fea Compare July 11, 2025 05:45
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jul 11, 2025
… (#146025)

With #145837, the `ReductionProcessor` component is now used by both
OpenMP and `do concurrent`. Therefore, this PR moves it to a shared
location: `flang/Lower/Support`.

PR stack:
- llvm/llvm-project#145837
- llvm/llvm-project#146025 (this one)
- llvm/llvm-project#146028
- llvm/llvm-project#146033
ergawy added a commit that referenced this pull request Jul 11, 2025
…146028)

Re-organizes the op definition a little bit and removes a method that
does not add much value to the API.

PR stack:
- #145837
- #146025
- #146028 (this one)
- #146033
Base automatically changed from users/ergawy/convert_locality_specs_to_clauses_7 to main July 11, 2025 06:30
ergawy added 2 commits July 11, 2025 01:31
Now that we have changes introduced by #145837, mapping reductions from
`do concurrent` to OpenMP is almost trivial. This PR adds such mapping.
@ergawy ergawy force-pushed the users/ergawy/convert_locality_specs_to_clauses_8 branch from 39e6ed7 to c1834bb Compare July 11, 2025 06:31
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jul 11, 2025
…defintion (#146028)

Re-organizes the op definition a little bit and removes a method that
does not add much value to the API.

PR stack:
- llvm/llvm-project#145837
- llvm/llvm-project#146025
- llvm/llvm-project#146028 (this one)
- llvm/llvm-project#146033
@ergawy ergawy merged commit 0e9b7b0 into main Jul 11, 2025
9 checks passed
@ergawy ergawy deleted the users/ergawy/convert_locality_specs_to_clauses_8 branch July 11, 2025 07:19
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jul 11, 2025
…` to OpenMP (#146033)

Now that we have changes introduced by #145837, mapping reductions from
`do concurrent` to OpenMP is almost trivial. This PR adds such mapping.

PR stack:
- llvm/llvm-project#145837
- llvm/llvm-project#146025
- llvm/llvm-project#146028
- llvm/llvm-project#146033 (this one)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants